6 research outputs found

    Okapi: Causally Consistent Geo-Replication Made Faster, Cheaper and More Available

    Okapi is a new causally consistent geo-replicated key-value store. Okapi leverages two key design choices to achieve high performance. First, it relies on hybrid logical/physical clocks to achieve low latency even in the presence of clock skew. Second, Okapi achieves higher resource efficiency and better availability, at the expense of a slight increase in update visibility latency. To this end, Okapi implements a new stabilization protocol that uses a combination of vector and scalar clocks and makes a remote update visible only when its delivery has been acknowledged by every data center. We evaluate Okapi with different workloads on Amazon AWS, using three geographically distributed regions and 96 nodes. We compare Okapi with two recent approaches to causal consistency, Cure and GentleRain. We show that Okapi delivers up to two orders of magnitude better performance than GentleRain, and that Okapi achieves up to 3.5x lower latency and a 60% reduction in metadata overhead with respect to Cure.
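
    The abstract does not spell out the clock mechanism, but hybrid logical/physical clocks (HLCs) are a well-known construction. The sketch below shows the standard HLC update rules that systems like Okapi build on; the class and method names are illustrative, not Okapi's actual API.

```python
import time

class HybridClock:
    """Sketch of a hybrid logical/physical clock (HLC).

    A timestamp is a pair (l, c): l tracks the largest physical time
    seen so far, and c is a logical counter that breaks ties while the
    physical component stands still (e.g., under clock skew).
    """

    def __init__(self):
        self.l = 0  # largest physical time observed so far
        self.c = 0  # logical counter for events sharing the same l

    def _physical_now(self):
        return int(time.time() * 1000)  # wall-clock time in milliseconds

    def now(self):
        """Timestamp a local event or an outgoing message."""
        pt = self._physical_now()
        if pt > self.l:
            self.l, self.c = pt, 0  # physical clock advanced: reset counter
        else:
            self.c += 1             # physical clock stalled: bump counter
        return (self.l, self.c)

    def update(self, l_msg, c_msg):
        """Merge the timestamp carried by an incoming message."""
        pt = self._physical_now()
        l_new = max(self.l, l_msg, pt)
        if l_new == self.l == l_msg:
            c_new = max(self.c, c_msg) + 1
        elif l_new == self.l:
            c_new = self.c + 1
        elif l_new == l_msg:
            c_new = c_msg + 1
        else:
            c_new = 0  # the local physical clock dominates both
        self.l, self.c = l_new, c_new
        return (self.l, self.c)
```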

    PaRiS: Causally Consistent Transactions with Non-blocking Reads and Partial Replication

    Geo-replicated data platforms form the backbone of several large-scale online services. Transactional Causal Consistency (TCC) is an attractive consistency level for building such platforms. TCC avoids many anomalies of eventual consistency, eschews the synchronization costs of strong consistency, and supports interactive read-write transactions. Partial replication is another attractive design choice for building geo-replicated platforms, as it increases the storage capacity and reduces update propagation costs. This paper presents PaRiS, the first TCC system that supports partial replication and implements non-blocking parallel read operations, whose latency is paramount for the performance of read-intensive applications. PaRiS relies on a novel protocol to track dependencies, called Universal Stable Time (UST). By means of a lightweight background gossip process, UST identifies a snapshot of the data that has been installed by every DC in the system. Hence, transactions can consistently read from such a snapshot on any server in any replication site without having to block. Moreover, PaRiS requires only one timestamp to track dependencies and define transactional snapshots, thereby achieving resource efficiency and scalability. We evaluate PaRiS on a large-scale AWS deployment composed of up to 10 replication sites. We show that PaRiS scales well with the number of DCs and partitions, while being able to handle larger datasets than existing solutions that assume full replication. We also demonstrate the performance gain of non-blocking reads over a blocking alternative (up to 1.47x higher throughput with 5.91x lower latency for read-dominated workloads, and up to 1.46x higher throughput with 20.56x lower latency for write-heavy workloads).
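
    The abstract describes UST only at a high level. As a rough illustration of the idea, assuming each partition advertises a scalar "local stable time", a gossip round can reduce these to a single global minimum that every data center is guaranteed to have installed, and reads can then be served from that snapshot plus the client's own cached writes. All names below are hypothetical, not PaRiS's actual interfaces.

```python
from dataclasses import dataclass, field

@dataclass
class Version:
    value: object
    ts: int  # single scalar commit timestamp, as PaRiS tracks one

@dataclass
class Partition:
    store: dict = field(default_factory=dict)  # key -> [Version], oldest first
    local_stable_time: int = 0  # all updates up to this ts are installed here

def gossip_ust(all_partitions):
    """One aggregation round: the Universal Stable Time is the minimum
    stable time over every partition in every data center, i.e. a
    snapshot that all of them are guaranteed to have installed."""
    return min(p.local_stable_time for p in all_partitions)

def snapshot_read(partition, key, ust, client_cache):
    """Non-blocking read: serve the freshest version inside the UST
    snapshot, overlaid with the client's own not-yet-stable writes."""
    if key in client_cache:  # the client always sees its own writes
        return client_cache[key]
    visible = [v for v in partition.store.get(key, []) if v.ts <= ust]
    return visible[-1].value if visible else None
```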

    The Design of Wren, a Fast and Scalable Transactional Causally Consistent Geo-Replicated Key-Value Store

    This paper presents the design of Wren, a new geo-replicated key-value store that achieves Transactional Causal Consistency. Wren leverages two design choices to achieve higher performance and better scalability than existing systems. First, Wren uses hybrid logical/physical clocks to timestamp data items. Hybrid clocks allow Wren to achieve low response times by avoiding the latencies that existing systems based on physical clocks incur to cope with clock skew. Second, Wren relies on a novel dependency tracking and stabilization protocol, called Hybrid Stable Time (HST). HST uses only two scalar values per update, regardless of the number of data centers and the number of nodes within a data center. HST achieves high resource efficiency and scalability at the cost of a slight increase in remote update visibility latency. We discuss why Wren achieves higher performance and better scalability than state-of-the-art approaches.
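
    A minimal sketch of the two-scalar idea, under the assumption that each update carries one timestamp for its origin data center and one covering updates from remote data centers, so that a transaction snapshot is simply a pair of stable times. The types and names are illustrative; the abstract does not specify Wren's actual message formats.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HybridTimestamp:
    local_ts: int   # hybrid-clock value assigned in the origin data center
    remote_ts: int  # stable time covering updates from remote data centers

def visible_in_snapshot(update_ts, snapshot):
    """An update is readable iff both of its scalars fall within the
    snapshot, regardless of how many data centers or nodes exist."""
    return (update_ts.local_ts <= snapshot.local_ts and
            update_ts.remote_ts <= snapshot.remote_ts)

# A snapshot at (120, 95) admits an update stamped (110, 90)
# but not one stamped (110, 99).
snap = HybridTimestamp(local_ts=120, remote_ts=95)
assert visible_in_snapshot(HybridTimestamp(110, 90), snap)
assert not visible_in_snapshot(HybridTimestamp(110, 99), snap)
```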

    Optimistic Causal Consistency for Geo-Replicated Key-Value Stores

    Causal consistency is an attractive consistency model for geo-replicated data stores because it hits a sweet spot in the ease-of-programmability vs. performance trade-off. In this paper we propose a new approach to causal consistency, which we call Optimistic Causal Consistency (OCC). The optimism of our approach lies in the fact that updates from a remote data center are immediately made visible to clients in the local data center. A client, hence, always reads the freshest version of an item, whose dependencies, however, might not have been installed in the local data center yet. When serving a read request, a server can detect whether it has not yet received such dependencies. This is achieved without inter-server synchronization, thanks to cheap dependency metadata supplied by the client. Upon detecting a missing dependency, the server waits to receive it. This approach contrasts with the design of existing systems, which are prone to exposing stale versions of data items in order to ensure that clients only see versions whose dependencies have already been replicated in the local data center. OCC explores a novel trade-off in the landscape of consistency models. Because network partitions are rare events in practice, OCC partially trades availability to improve other performance metrics. On the one hand, OCC maximizes the freshness of data returned to clients and reduces the communication overhead. On the other hand, a server might need to wait before serving a client's request, leading the system to be unavailable in case of a network partition. To overcome this limitation, we propose a recovery mechanism that allows an OCC system to fall back to a pessimistic protocol to recover availability. We implement OCC in a new system, which we call POCC. We compare POCC against a recent (pessimistic) approach to causal consistency using heterogeneous workloads on an Amazon AWS deployment encompassing up to 96 nodes scattered over 3 data centers. We show that POCC is able to maximize the freshness of data returned to clients while providing comparable or better performance than its pessimistic counterpart in a wide range of production-like workloads.
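
    A minimal sketch of the client-assisted check described above, assuming the client piggybacks its causal dependencies as a vector of per-data-center timestamps; the structure and names are hypothetical, not POCC's actual implementation.

```python
import threading

class OptimisticServer:
    """Sketch of a server that makes remote updates visible immediately
    and waits only when a reader's stated dependencies are missing."""

    def __init__(self, num_dcs):
        self.installed = [0] * num_dcs  # highest update ts applied, per DC
        self.store = {}                 # key -> freshest value
        self.cv = threading.Condition()

    def apply_remote_update(self, dc, ts, key, value):
        with self.cv:
            self.store[key] = value  # visible right away: the optimism
            self.installed[dc] = max(self.installed[dc], ts)
            self.cv.notify_all()

    def read(self, key, client_deps):
        """client_deps[i] is the client's highest observed ts from DC i,
        piggybacked on the request. Wait only if a dependency is missing."""
        with self.cv:
            while any(self.installed[i] < client_deps[i]
                      for i in range(len(self.installed))):
                self.cv.wait()  # blocks until the missing updates arrive
            return self.store.get(key)
```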

    Efficient Protocols for Enforcing Causal Consistency in Geo-Replicated Key-Value Data Stores

    Modern large-scale data platforms manage colossal amounts of data, generated by an ever-increasing number of concurrent users. Geo-replicated and sharded key-value data stores play a central role when building such platforms. As the strongest consistency model proven not to compromise availability, causal consistency (CC) is perfectly positioned to address the needs of such large-scale systems, while preventing some ordering anomalies. Transactional Causal Consistency (TCC) augments CC with richer transactional semantics, simplifying the development of distributed applications. However, achieving CC/TCC efficiently is very challenging. In this thesis we introduce several protocols and designs for high-performance causally consistent geo-replicated key-value data stores in several different settings. First, we present a new approach to implementing CC in geo-replicated data stores, called Optimistic Causal Consistency (OCC). By introducing a technique that we call client-assisted lazy dependency resolution, OCC makes it possible for updates replicated to a remote data center to become visible immediately, without checking whether their causal dependencies have been received. We further propose a recovery mechanism that allows an OCC system to fall back on a pessimistic protocol to continue operating during network partitions. We show that OCC improves data freshness, while offering performance that is comparable to or better than its pessimistic counterpart. Next, we address the problem of providing low-latency TCC reads under full replication. We present Wren, the first TCC system that simultaneously achieves low latency, by implementing non-blocking read operations, and scales out efficiently by sharding. Wren introduces new protocols for transaction execution, dependency tracking, and stabilization. The transaction protocol supports non-blocking reads by providing a transaction snapshot as the union of a fresh causal snapshot installed by every partition in the local data center and a client-side cache for writes that are not yet included in the snapshot. The dependency tracking and stabilization protocols require only two scalar timestamps, resulting in efficient resource utilization and providing scalability in terms of shards and replication sites under a full replication setting. In return for these benefits, Wren slightly increases the visibility latency of updates. Finally, we present PaRiS, the first TCC system that supports partial replication and provides low latency by implementing non-blocking parallel read operations. PaRiS relies on a novel protocol to track dependencies, called Universal Stable Time (UST). By means of a lightweight background gossip process, UST identifies a snapshot of the data that has been installed by every data center in the system. PaRiS equips clients with a private cache, in which they store their own updates that are not yet reflected in the snapshot. The combination of the UST-defined snapshot with the client-side cache enables interactive transactions that can consistently read from any replication site without blocking. Moreover, PaRiS requires only one timestamp to track dependencies and define transactional snapshots, thereby achieving resource efficiency and scalability in terms of shards and replication sites, in a partial replication setting.
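
    The thesis repeatedly combines a server-side stable snapshot with a client-side cache of the client's own recent writes. A rough sketch of how such a cache could provide read-your-writes without blocking, with hypothetical names, is given below.

```python
class ClientSession:
    """Client-side cache that holds the client's own writes until the
    stable snapshot catches up to them, giving read-your-writes without
    ever blocking a server-side snapshot read."""

    def __init__(self):
        self.cache = {}  # key -> (value, commit_ts)

    def record_write(self, key, value, commit_ts):
        self.cache[key] = (value, commit_ts)

    def evict_stable(self, snapshot_ts):
        """Once a write's timestamp is covered by the snapshot, the
        server's snapshot read returns it, so the cached copy can go."""
        self.cache = {k: (v, ts) for k, (v, ts) in self.cache.items()
                      if ts > snapshot_ts}

    def read(self, key, snapshot_read):
        """Prefer the cached own-write; otherwise read from the snapshot."""
        if key in self.cache:
            return self.cache[key][0]
        return snapshot_read(key)
```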

    Distributed Transactional Systems Cannot Be Fast

    We prove that no fully transactional system can provide fast read transactions (including read-only transactions, which are considered the most frequent in practice). Specifically, to achieve fast read transactions, the system has to give up support for transactions that write more than one object. We prove this impossibility result for distributed storage systems that are causally consistent, i.e., systems that are not required to ensure any strong form of consistency. Therefore, our result also holds for any system that ensures a consistency level stronger than causal consistency, e.g., strict serializability. The impossibility result holds even for systems that store only two objects (and support at least two servers and at least four clients). It also holds for systems that are partially replicated. Our result justifies the design choices of state-of-the-art distributed transactional systems and indicates that system designers should not put further effort into designing fully functional systems that support both fast read transactions and causal or any stronger form of consistency.